Review


Snippets! How do they work?

First, let’s prep some data


Get data from the palmerpenguins package and inspect names

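# packages used in this walkthrough
library(palmerpenguins)  # penguins_raw
library(janitor)         # clean_names()
library(stringr)         # str_detect(), str_view()
library(dplyr)           # %>%

df <- penguins_raw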

df
## # A tibble: 344 × 17
##    studyName `Sample Number` Species       Region Island Stage   `Individual ID`
##    <chr>               <dbl> <chr>         <chr>  <chr>  <chr>   <chr>          
##  1 PAL0708                 1 Adelie Pengu… Anvers Torge… Adult,… N1A1           
##  2 PAL0708                 2 Adelie Pengu… Anvers Torge… Adult,… N1A2           
##  3 PAL0708                 3 Adelie Pengu… Anvers Torge… Adult,… N2A1           
##  4 PAL0708                 4 Adelie Pengu… Anvers Torge… Adult,… N2A2           
##  5 PAL0708                 5 Adelie Pengu… Anvers Torge… Adult,… N3A1           
##  6 PAL0708                 6 Adelie Pengu… Anvers Torge… Adult,… N3A2           
##  7 PAL0708                 7 Adelie Pengu… Anvers Torge… Adult,… N4A1           
##  8 PAL0708                 8 Adelie Pengu… Anvers Torge… Adult,… N4A2           
##  9 PAL0708                 9 Adelie Pengu… Anvers Torge… Adult,… N5A1           
## 10 PAL0708                10 Adelie Pengu… Anvers Torge… Adult,… N5A2           
## # … with 334 more rows, and 10 more variables: Clutch Completion <chr>,
## #   Date Egg <date>, Culmen Length (mm) <dbl>, Culmen Depth (mm) <dbl>,
## #   Flipper Length (mm) <dbl>, Body Mass (g) <dbl>, Sex <chr>,
## #   Delta 15 N (o/oo) <dbl>, Delta 13 C (o/oo) <dbl>, Comments <chr>
# gross!
names(df)
##  [1] "studyName"           "Sample Number"       "Species"            
##  [4] "Region"              "Island"              "Stage"              
##  [7] "Individual ID"       "Clutch Completion"   "Date Egg"           
## [10] "Culmen Length (mm)"  "Culmen Depth (mm)"   "Flipper Length (mm)"
## [13] "Body Mass (g)"       "Sex"                 "Delta 15 N (o/oo)"  
## [16] "Delta 13 C (o/oo)"   "Comments"
# clean those names
df %>% 
  clean_names() %>% 
  names()
##  [1] "study_name"        "sample_number"     "species"          
##  [4] "region"            "island"            "stage"            
##  [7] "individual_id"     "clutch_completion" "date_egg"         
## [10] "culmen_length_mm"  "culmen_depth_mm"   "flipper_length_mm"
## [13] "body_mass_g"       "sex"               "delta_15_n_o_oo"  
## [16] "delta_13_c_o_oo"   "comments"


How about those column names?

A little regex first
😄 Pro tip: [] matches any single character that you list inside the brackets

# are there spaces, capital letters, parentheses, slashes, or dashes in col names?
str_detect(names(df), "[\\sA-Z()/-]")
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE
# but we only want one answer, so wrap in any()
any(str_detect(names(df), "[\\sA-Z()/-]"))
## [1] TRUE
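
To see what each part of that character class is doing, here is a minimal sketch that runs the same pattern against a handful of made-up names. The `samples` vector is hypothetical and not part of the penguins data; each entry is chosen to trigger (or not trigger) one piece of the class.

```r
library(stringr)

# The same character class used above
pattern <- "[\\sA-Z()/-]"

# Hypothetical sample names, one per piece of the class
samples <- c(
  "body mass",    # contains a space      -> matched by \\s
  "Species",      # contains a capital    -> matched by A-Z
  "mass(g)",      # contains parentheses  -> matched by ( and )
  "o/oo",         # contains a slash      -> matched by /
  "date-egg",     # contains a dash       -> matched by the trailing -
  "body_mass_g"   # lowercase, digits-free, underscores only -> no match
)

# One TRUE/FALSE per name
hits <- str_detect(samples, pattern)
hits
```

The dash sits last inside the brackets so that it is read as a literal dash rather than as a range like the one in `A-Z`.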

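😄 Pro tip: str_view() to see string matches (older stringr versions require htmlwidgets)

# let's see where our pattern matches
# note: in stringr 1.5.0 and later, str_view_all() is deprecated and
# str_view() highlights every match, so use str_view() there instead
str_view_all(names(df), "[\\sA-Z()/-]")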


Finally, a snippet!

😄 Pro tip: shift+tab to expand a snippet
😄 Pro tip: cmd+i to fix bad indentation

# snippet time! type "if" then hit shift+tab
# paste in our regex condition and our clean_names code

# try yours right below here:



# should look like this when you're done
if (any(str_detect(names(df), "[\\sA-Z()/-]"))) {
  df <- df %>% 
  clean_names()
}

# highlight the if statement above then hit cmd+i to fix the indentation

# inspect
names(df)
##  [1] "study_name"        "sample_number"     "species"          
##  [4] "region"            "island"            "stage"            
##  [7] "individual_id"     "clutch_completion" "date_egg"         
## [10] "culmen_length_mm"  "culmen_depth_mm"   "flipper_length_mm"
## [13] "body_mass_g"       "sex"               "delta_15_n_o_oo"  
## [16] "delta_13_c_o_oo"   "comments"


What will this return now? Someone tell me before running it!

any(str_detect(names(df), "[\\sA-Z()/-]"))
## [1] FALSE


Can we pipe that? Yes we can. Inside to outside.

😄 Pro tip: cmd+shift+m to insert pipe

names(df) %>% 
  str_detect("[\\sA-Z()/-]") %>% 
  any()
## [1] FALSE
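
"Inside to outside" means the innermost call of the nested version becomes the first step of the pipe. A small sketch of that equivalence, using a hypothetical `nms` vector in place of `names(df)` so it runs on its own:

```r
library(stringr)
library(magrittr)  # provides %>%

# Hypothetical stand-in for names(df)
nms <- c("study_name", "Sample Number", "sex")

# Nested: read from the innermost call outward
nested <- any(str_detect(nms, "[\\sA-Z()/-]"))

# Piped: the innermost value comes first, then each wrapping call in turn
piped <- nms %>%
  str_detect("[\\sA-Z()/-]") %>%
  any()

# Both forms give the same answer
identical(nested, piped)
```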


Assign the clean names

😄 Pro tip: alt+dash for assignment arrow

df <- df %>% 
  clean_names()


More snippets! Functions

😄 Pro tip: type fun then hit shift+tab
😄 Pro tip: cmd+f for find (and replace)

# type fun then hit shift+tab
# name it clean_if_bad_names
# one arg called x
# put our if statement in the body
# cmd+i to fix indent
# cmd+f to change df to x

# try yours right below here:



# should look like this when you are done
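clean_if_bad_names <- function(x) {
  # only clean when a name has a space, a capital letter, or one of ( ) / -
  if (any(str_detect(names(x), "[\\sA-Z()/-]"))) {
    x <- clean_names(x)
  }
  # return x either way, cleaned or untouched
  x
}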


# reset our df back to original with bad column names
df <- penguins_raw

# use our new function
df <- clean_if_bad_names(x = df)
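
A quick way to convince yourself the function is safe to call more than once: run it twice and compare. This is a sketch on a hypothetical toy data frame (`messy` is made up for illustration), with the function repeated so the block runs on its own.

```r
library(janitor)
library(stringr)

# Same function as above, repeated so this sketch is self-contained
clean_if_bad_names <- function(x) {
  if (any(str_detect(names(x), "[\\sA-Z()/-]"))) {
    x <- clean_names(x)
  }
  x
}

# Hypothetical toy data frame with messy names
messy <- data.frame(
  `Body Mass (g)` = c(3750, 3800),
  Sex = c("MALE", "FEMALE"),
  check.names = FALSE  # keep the messy names exactly as written
)

once  <- clean_if_bad_names(messy)  # names get cleaned
twice <- clean_if_bad_names(once)   # nothing left to clean, returned as-is

names(once)
identical(once, twice)
```

The second call skips `clean_names()` entirely because the condition is already `FALSE`, which is the whole point of wrapping it in the `if`.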


More snippets! For loops

# type for then hit shift+tab:
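
As far as I recall, the built-in `for` snippet expands to a skeleton with placeholders for the loop variable and the vector to loop over. Here is one way it might look once filled in, as a sketch: `raw_names` is a hypothetical stand-in for `names(penguins_raw)` and `bad_names` is just an illustrative variable name.

```r
library(stringr)

# Hypothetical stand-in for names(penguins_raw)
raw_names <- c("studyName", "Sample Number", "Species", "Body Mass (g)", "sex")

# Collect the names that contain a space, a capital letter, or one of ( ) / -
bad_names <- character(0)
for (nm in raw_names) {
  if (str_detect(nm, "[\\sA-Z()/-]")) {
    bad_names <- c(bad_names, nm)
  }
}

bad_names
```

Because `str_detect()` is vectorised, `raw_names[str_detect(raw_names, "[\\sA-Z()/-]")]` gives the same result without a loop; the loop is here only to practise the snippet.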


More snippets! See them all

😄 Pro tip: make your own (Tools > Global Options > Code > Edit Snippets)

The contents of the snippet should be indented below the snippet name using the tab key (rather than with spaces). Variables can be defined using the form ${1:varname}.

Make a snippet called ec: start with a line that reads snippet ec, then add the following lines below it, each indented with a tab

library(here)
library(ggplot2)
library(tidyr)
library(dplyr)
library(stringr)
library(purrr)

New Snippet

Everyday Carry - hit shift+tab after the ec below

ec

I use that as a lightweight version of library(tidyverse) when I don’t want or need to load all the core tidyverse packages. That matters most in a production environment that needs to stay trim.


ggplot2 GUI plot builder will provide code

😄 Pro tip: If you are new to ggplot2, install the esquisse package. It comes with the ggplot2 builder addin, which lets you build a plot in a GUI and then gives you the code for it, too!


Before you start, click in the code chunk below, then go to the Addins menu and build your plot. When you click to insert the code, it will be inserted wherever you last clicked in your script.

# click below here before starting the esquisse addin